OpenAI’s ‘Jailbreak-Proof’ Models Compromised Shortly After Release

Published: 2025-08-06 22:23:02
BTCCSquare news:

OpenAI’s newly released open-source models, GPT-OSS-120b and GPT-OSS-20b, were touted as resistant to jailbreaking thanks to rigorous adversarial training. Yet the pseudonymous jailbreaker Pliny the Liberator cracked both models within hours of their release, announcing the breach on X with screenshots showing the models generating instructions for illicit activities, including methamphetamine production and malware creation.

The incident marks a significant setback for OpenAI, which had emphasized the safety testing of these models ahead of the anticipated launch of GPT-5. The rapid compromise underscores the ongoing challenge of securing advanced AI systems against exploitation.

